Abstract: Smart energy grid is an emerging area for new 
applications of machine learning in a non-stationary environ- 
ment. Such a non-stationary environment emerges when large- 
scale failures occur at power distribution networks due to 
external disturbances such as hurricanes and severe storms. 
Power distribution networks lie at the edge of the grid, and 
are especially vulnerable to external disruptions. Quantifiable 
approaches are lacking and needed to learn non-stationary 
behaviors of large-scale failure and recovery of power distri- 
bution. This work studies such non-stationary behaviors in three 
aspects. First, a novel formulation is derived for an entire life 
cycle of large-scale failure and recovery of power distribution. 
Second, spatial-temporal models of failure and recovery of power 
distribution are developed as geo-location based multivariate 
non-stationary GI(t)/G(t)/∞ queues. Third, the non-stationary 
spatial-temporal models identify a small number of parameters 
to be learned. Learning is applied to two real-life examples of 
large-scale disruptions. One is from Hurricane Ike, where data 
from an operational network is exact on failures and recoveries. 
The other is from Hurricane Sandy, where aggregated data is 
used for inferring failure and recovery processes at one of the 
impacted areas. Model parameters are learned using real data. 
Two findings emerge as results of learning: (a) Failure rates 
behave similarly at the two different provider networks for two 
different hurricanes but differently at the geographical regions, 
(b) Both rapid- and slow-recovery are present for Hurricane 
Ike but only slow recovery is shown for a regional distribution 
network from Hurricane Sandy. 
I. Introduction 

Non-stationary modeling and learning have been widely applied in many domains [1], [2]. This work contributes a new application in the emerging area of the smart energy grid. The application is to learn, from failure data, how distributed power networks respond to external disturbances such as hurricanes. The learned knowledge provides an understanding of how power networks fail and recover in severe weather. Such an understanding is a prerequisite for modernizing our power infrastructure. 

Power distribution networks lie at the edge of the energy grid, delivering medium and low voltages to residences and organizations [3]. Distribution networks consist of leaf nodes of the energy infrastructure and are thus susceptible to external disturbances. For example, natural disasters cause widespread destruction and service disruptions to distribution networks [4], [5]. About 16 major hurricanes and severe storms occurred in North America in the past 5 years [6], each of which disrupted electricity service to between 500,000 and several million customers for days [6]. 

Existing approaches to large-scale failures of power distribution 
are primarily empirical. For example, 
empirical studies have been conducted on assessing damages 
from large-scale power failures (see [7] and references 
therein). Monitoring systems have been used in the power industry 
to respond to failures (see [8] for examples). As hurricanes 
and severe storms appear to occur frequently and at a large 
scale [6], empirical approaches become inadequate for real-time 
failure assessment in a wide geographical area [9]. 
Furthermore, recovery from large-scale power failures is even 
less understood. This is evidenced by how difficult it was 
for utilities to provide accurate recovery times to customers 
[9]. Overall, quantifiable approaches are lacking and needed 
for characterizing how power distribution networks respond 
to external disturbances. This is important for discovering and 
mitigating vulnerabilities in order to enhance the power 
infrastructure [10], [11]. 

Unique challenges emerge for quantifying how power distribution networks respond to large-scale external disturbances. The first is randomness. External disturbances such as hurricanes exhibit random behaviors. The resulting power failures also occur randomly. The second is the dynamic nature of failures and recoveries due to the evolution of external disturbances. For example, a hurricane usually makes landfall with strong winds and then gradually weakens as it moves inland. Hence, non-stationarity (randomness and dynamics) is an intrinsic characteristic of large-scale failures. 

Non-stationary learning is a natural approach for quantify- 
ing non-stationary large-scale failure and recovery of power 
distribution induced by external disturbances. However, an 
additional challenge for learning is lack of data. This may 
appear to be a paradox: A large-scale external disturbance 
such as a hurricane often results in thousands of power 
failures, which amounts to a lot of data. However, in the 
space of external disturbances, a hurricane generates only one 
sample, i.e., a snap-shot of network failures and recoveries 
from one external disturbance. Hence, data from an individual 
disturbance is valuable and should be used to enable learning. 
Note that using real data for studying large-scale power 
failures and recoveries is not yet a common practice for the 
power infrastructure. Real data on power failures from external 
disturbances is rare [12], [13]. A recent work shows the strength 
of combining algorithmic approaches with real data on 
geographically correlated power failures [14]. The focus there is 
on power transmission rather than distribution. 

Incorporating all challenges, the basic research question we 
intend to answer is: how can non-stationary behaviors of 
large-scale failure and recovery of power distribution be 
learned using real data from one external disturbance? Combining 
model-based and data-driven methods is a viable approach for 
limited samples [15]. A model identifies pertinent quantities 
that determine non-stationary random processes of failure and 
recovery. We first derive a problem formulation to obtain a 
model. What remains unknown are model parameters, which 
can be learned from data. Such a combination of model-based 
and data-driven approaches directs learning to a small number 
of functions or parameters, and thus makes effective use of 
data. In addition, a combination of model-based and data- 
driven approaches makes learning explanatory: Learned model 
parameters bear physical meanings on how distributed power 
distribution responds to an external disruption. 

Our formulation focuses on power failures and recoveries induced by exogenous weather. The time scale of such failures and recoveries is taken to be a minute, consistent with that of a hurricane (see Section V for details). Power failures can also occur in bursts at a small time scale of seconds or less [16]. Such bursty failures are usually due to an internal network structure (see Section V) and are not studied in this work. Self-recoveries often occur at the small time scale of sub-seconds [16], whereas recovery by field crews occurs in minutes or beyond. Hence our model at the time scale of a minute focuses on weather-induced failures and recoveries that cannot be repaired through self-healing. Such a model provides an understanding of how distributed power infrastructure responds to external disturbances. 

Our formulation begins with the spatial scale of network nodes and the temporal scale of a minute. As the data from an external disturbance is insufficient to completely specify a detailed temporal-spatial model [17], we aggregate spatial variables into groups. A group can be a city that consists of nodes from a small geographical area. The resulting model thus characterizes an entire non-stationary life cycle of large-scale failure and recovery in time and at geo-locations. Such a spatial-temporal model is a multivariate generalization of GI(t)/G(t)/∞ queues [18] to include geo-locations. The GI(t)'s and G(t)'s are arrival (failure) processes and departure (recovery) processes for individual geographical areas; "∞" means that it is possible for recovery to occur immediately after a failure, e.g., in less than a minute in this work. Hence, multivariate GI(t)'s and G(t)'s constitute our model, which completely specifies non-stationary behaviors of large-scale failure and recovery at a power distribution network. 

We consider one simplified characterization of GI(t)/G(t)/∞ queues, to the expected values [18]. What to learn then becomes clear: a small number of pertinent parameters of GI(t) and G(t) at different geo-locations, i.e., failure rates and recovery time distributions. We first obtain detailed data on large-scale power failures from a real-life example of a natural disaster, Hurricane Ike. Ike caused power failures in the southern states of the US and affected more than 2 million users in 2008. We devise learning for two scenarios using the real data. The first learns only temporal processes of non-stationary failure and recovery by aggregating over spatial variables of nodes in an entire network. The second learns geo-location based spatial-temporal processes by aggregating nodes in cities. We show that the modeling facilitates learning, in that model parameters can be easily estimated using the failure data. We then apply the model to another data set from Hurricane Sandy. Hurricane Sandy caused widespread power failures affecting more than 8 million people in the northeast of the US in 2012. The data set consists of aggregated rather than detailed power failures in one of the impacted areas. Our approach is shown to be applicable to the aggregated data for estimating failure and recovery rates. Our approach also shows what cannot be learned using aggregated data. 

In summary, the contribution of this work consists of the 
following: (a) a novel model based on non-stationary random 
processes and dynamic queues for weather-induced large- 
scale failure and recovery of power distribution, (b) simple 
learning approaches for estimating parameters of the non- 
stationary model, (c) applications of the model and non- 
stationary learning to real data from two hurricanes at different 
locations. 

The rest of the paper is organized as follows. Section II provides background knowledge and an example of large-scale failures at a power distribution network. Sections III and IV develop a problem formulation of spatial-temporal non-stationary random processes. Section V describes the real data from Hurricane Ike and learns a geo-temporal model. Section VI studies non-stationary failure and recovery using parts of real data from Hurricane Sandy. Section VII discusses our findings. Section VIII concludes the paper. 



II. Background and Example 

We now provide examples on the temporal scale, and non- 
stationarity of failure and recovery. 



A. Time Scale of Failure and Recovery 

We first discuss the time scale for modeling weather-induced failures and recoveries. A power distribution network consists of components such as substations, feeders, transformers, power circuits, circuit breakers, transmission lines, and meters. An example power distribution system is illustrated in Figure 1 with a commonly used radial topology. Three types of components are shown for illustration: a primary substation, three secondary power sources, and loads. Links correspond to power lines. Assume that either a component or a link can fail during a hurricane. Assume that the substation is used as the primary source during normal operation. The secondary sources, which can be distributed renewable sources, are used for back-up when the primary source fails [19]. Then the following scenarios can occur for failure and recovery: 

(a) If all the sources fail due to an external disturbance, there is no electricity supply to any loads. Hence, the loads experience dependent failures that can occur instantaneously. The scenario of dependent failures also applies to other components upstream in a radial topology that cause loss of electricity at nodes downstream. Dependent failures are often experienced by loads within sub-seconds. 

Fig. 1: A Section in A Distribution Network. 

Fig. 2: Empirical temporal distribution of failure durations in 3D. 

(b) If a link that connects a load to the network fails due to an external disturbance, there is no electricity supply to the load. Such link failures can occur independently due to fallen trees or power lines. Thus loads experience independent loss of electricity. As such independent failures are caused by exogenous weather, they are assumed to occur at a time scale of a minute or beyond. Such a time scale can be estimated from how rapidly a hurricane-force wind passes a city. Consider a small city of 1,600 acres as an example. Based on the IEEE standard (IEEE/ASTM SI 10-1997) [20], 1,600 acres is 2.5 square miles, so an approximate "diameter" of the city is about 1.6 miles. Consider a wind speed of 60 miles per hour. It takes about 1.6 minutes for the wind to pass the city. This provides a basis for using a minute as the time scale of weather-induced failures. 

(c) Recovery depends on the types of failures and recovery schemes. Certain failures can be repaired through self-recovery [16]. For example, if the primary substation fails, the electricity supply to all loads can be recovered when the three secondary sources are in operation. In general, self-recovery and automated reconfiguration built into power distribution usually operate at a time scale of sub-seconds or seconds [16]. However, failures due to external disturbances, e.g., falling trees and power lines, often require manual repair by field crews. Recovery time depends not only on restoration schemes but also on environmental constraints, and is thus considered random in this work. Such manual recovery time ranges from minutes to hours or days after a failure. 
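
The time-scale estimate in scenario (b) can be reproduced with a few lines of arithmetic. The sketch below assumes, for illustration only, that the city is a square whose side length serves as its "diameter"; the function name `transit_time_minutes` is hypothetical.

```python
import math

ACRES_PER_SQUARE_MILE = 640.0  # 1 square mile = 640 acres


def transit_time_minutes(area_acres, wind_speed_mph):
    """Return (city width in miles, wind transit time in minutes).

    Assumption: the city is treated as a square, so its "diameter"
    is approximated by the side length of that square.
    """
    area_sq_miles = area_acres / ACRES_PER_SQUARE_MILE
    width_miles = math.sqrt(area_sq_miles)
    minutes = width_miles / wind_speed_mph * 60.0
    return width_miles, minutes


# The example from the text: a 1,600-acre city and a 60 mph wind.
width, minutes = transit_time_minutes(1600.0, 60.0)
print(f"width = {width:.2f} miles, transit time = {minutes:.2f} minutes")
```

Both values round to about 1.6, which matches the figures quoted in the text.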

In summary, failures and self-recoveries at a small time- 
scale of seconds or sub-seconds depend on detailed network 
structure and self-recovery schemes. Failure and recovery at 
a larger time scale of a minute and beyond are often due to 
external disturbances that evolve dynamically and randomly. 



B. Example of Non-Stationary Failure and Recovery 

To gain intuition on an entire life cycle of failure and recovery of a distribution network, we consider a real-life example of large-scale power failures that occurred during Hurricane Ike in 2008. Figure 2 shows a histogram of failure occurrence time and duration at an operational distribution network before, during and after the hurricane. Each bin has a length (failure occurrence time) of 1 hour (see footnote 1) and a width (duration) of 4 hours. The height of each bin represents the number of failures that occur at time t and last for duration d. Figure 3 shows geographical distributions of failure occurrences at two different time epochs, where failure occurrence is evidently non-stationary across geographical regions. Hence, 

(a) Failure occurrence is non-stationary, i.e., random and time-varying; 

(b) Recovery time is non-stationary, i.e., obeys different probability distributions for failures that occur at different times; 

(c) Failure occurrence and recovery time are also non-stationary spatially, i.e., exhibit different distributions for different geo-locations. 

Hence, samples on failure occurrence time and duration 
are not identically distributed but exhibit geo-temporal non- 
stationarity. 

C. Non-Stationary Learning 

Non-stationary random processes have been studied in the context of drifting concepts (see [21], [22], [23], [24] and references therein). Samples for learning are dynamically drawn from a non-stationary environment. An issue arises on the sample size, i.e., whether data is sufficient for characterizing underlying drifts of distributions. 

The problem of learning non-stationary processes in this work exhibits unique challenges in terms of sample size. For simplicity, batch data is assumed to be collected for learning an entire non-stationary life cycle of failure and recovery processes off-line. A challenge here is that there is only one snapshot of a distribution network in space and time from one external disturbance. The number of data sets is often small, i.e., from a few severe storms. Therefore, combining model-based and data-driven approaches becomes important, where data can be used to learn a small number of model parameters from one external disturbance at a time [15]. In addition, combining model-based and data-driven approaches for learning is required by the problem: learned model parameters need to exhibit physical meaning for generic network behaviors upon external disturbances. 

Footnote 1: CDT is used for all plots for Hurricane Ike. 

Fig. 3: Geo-locations of failures occurred in different time durations. Red marker: Failures occurred from 7 p.m. to 8 p.m. Sep. 12. Yellow marker: Failures occurred from 5 a.m. to 6 a.m. Sep. 13. 

III. Stochastic Model 

We now formulate large-scale failure and recovery based on 
non-stationary random processes. We begin with the detailed 
information on nodal statuses in a distribution system. We then 
aggregate the spatial variables of nodes to obtain temporal 
evolution of failure and recovery across geo-graphical areas. 

A. Failure and Recovery Probability 

A geo-temporal random process provides a theoretical basis 
for modeling large-scale failures. The temporal variable is time 
t that is assumed to be continuous at the scale of a minute. 
The spatial variable can be either geo- or network-location 
of a node. For simplicity, this work considers geo-location as 
a spatial variable to focus on location-based failures induced 
by severe weather. Nodes can be components in a distribu- 
tion system such as substations, feeders, hubs, transformers, 
transmission lines, and distributed energy sources. A shorthand 
notation i is used to specify the index of node i located at $z_i$, 
$i \in S = \{1, 2, \ldots, n\}$ for a power distribution network with 
n nodes. An underlying network topology is assumed to be 
radial so that cascading failures that occur in mesh networks 
are not considered. 

Let $X_i(z_i,t)$ be the status of the i-th node at time $t > 0$ for $1 \le i \le n$. We assume for simplicity that nodes only exhibit two states: $X_i(z_i,t) = 1$ if the i-th node is in a failure mode, i.e., without power supply, and $X_i(z_i,t) = 0$ if the node is in normal operation. Failures caused by external disturbances exhibit randomness. Whether and when a node fails is random. Whether and when a failed node recovers is also random. Hence, random processes can be used to characterize failure and recovery for all nodes in a network. 

Fig. 4: Histogram of failure occurrence time and the failure rate $\lambda_f(t)$ during Hurricane Ike. 

Given time $t > 0$, $P\{X_i(z_i,t+\tau) = 1\}$ characterizes the probability that node i is failed in the near future $t + \tau$, where $\tau > 0$ is a small time increment. Assume a node changes state, i.e., from failure to normal and vice versa. Then for the i-th node, $1 \le i \le n$, the change over $[t, t+\tau]$ in the probability that node i is in failure mode is 

$$P\{X_i(z_i,t+\tau) = 1\} - P\{X_i(z_i,t) = 1\} = P\{X_i(z_i,t+\tau) = 1, X_i(z_i,t) = 0\} - P\{X_i(z_i,t+\tau) = 0, X_i(z_i,t) = 1\}. \qquad (1)$$ 

Equation (1) assumes Markov temporal dependence, and 
can be applied to n nodes in a distribution network. The n 
equations together form a geo-temporal model of a network. 
Note that statistically dependent failures at the small time scale 
less than a minute are not considered here, as such failures 
are often caused by an internal network structure rather than 
exogenous weather. Spatial dependence is embedded in the 
model but will be studied explicitly in subsequent work. 
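
As a worked step, the balance in Equation (1) can be checked from the law of total probability for a two-state node. The split below is a standard identity and uses only the symbols already defined above.

```latex
\begin{align*}
P\{X_i(z_i,t+\tau)=1\} &= P\{X_i(z_i,t+\tau)=1,\, X_i(z_i,t)=1\}
                        + P\{X_i(z_i,t+\tau)=1,\, X_i(z_i,t)=0\}, \\
P\{X_i(z_i,t)=1\}      &= P\{X_i(z_i,t+\tau)=1,\, X_i(z_i,t)=1\}
                        + P\{X_i(z_i,t+\tau)=0,\, X_i(z_i,t)=1\}.
\end{align*}
```

Subtracting the second line from the first cancels the common term (the node is failed at both times) and leaves the newly-failed probability minus the newly-recovered probability, which is the right-hand side of Equation (1).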

B. Aggregated Geo-Temporal Process 

When large-scale failures are caused by one external dis- 
turbance, information available is from one "snapshot" of 
temporal spatial network statuses, and thus insufficient for 
specifying a complete temporal-spatial model at the node level. 
Hence, nodes are aggregated over a geographical region (Z), resulting in 

$$\sum_{i: z_i \in Z} \left[ P\{X_i(z_i,t+\tau) = 1\} - P\{X_i(z_i,t) = 1\} \right] = \sum_{i: z_i \in Z} P\{X_i(z_i,t+\tau) = 1, X_i(z_i,t) = 0\} - \sum_{i: z_i \in Z} P\{X_i(z_i,t+\tau) = 0, X_i(z_i,t) = 1\}. \qquad (2)$$ 

Here $P\{X_i(z_i,t) = 1\} = E\{I[X_i(z_i,t) = 1]\}$, where $I(\cdot)$ is an indicator function: $I(A) = 1$ if event A occurs, and $I(A) = 0$ otherwise. We can define a geo-temporal process as follows. 

Definition: $\{N(t,Z) \in \mathbb{N}, t > 0\}$ is a geo-temporal process where the spatial variables (i's) are aggregated for all nodes $z_i$ in a predefined region Z. $N(t,Z)$ is the number of nodes in failure state at time t located in Z, 

$$N(t,Z) = \sum_{i: z_i \in Z} I[X_i(z_i,t) = 1]. \qquad (3)$$ 

Combining Equations (2) and (3) we have, 

$$E\{\Delta N(t,Z)\} = \sum_{i: z_i \in Z} P\{X_i(z_i,t+\tau) = 1\} - \sum_{i: z_i \in Z} P\{X_i(z_i,t) = 1\}, \qquad (4)$$ 

where $\Delta N(t,Z) = N(t+\tau,Z) - N(t,Z)$ is an increment of the number of failed nodes in a certain region. $\Delta N(t,Z)$ is the result of either newly-failed or newly-recovered nodes. Hence, we define a failure process and a recovery process respectively. 

Definition: Failure process $\{N_f(t,Z) \in \mathbb{N}, t > 0\}$ is the number of failures that occurred up to time t. Recovery process $\{N_r(t,Z) \in \mathbb{N}, t > 0\}$ is the number of recoveries that occurred up to time t. 

Assume $\tau > 0$ is sufficiently small so that failure or recovery occurs at most once to a node during $(t, t+\tau)$. The increments on a failure process and a recovery process satisfy respectively, 

$$E\{\Delta N_f(t,Z)\} = \sum_{i: z_i \in Z} P\{X_i(z_i,t+\tau) = 1, X_i(z_i,t) = 0\},$$ 
$$E\{\Delta N_r(t,Z)\} = \sum_{i: z_i \in Z} P\{X_i(z_i,t+\tau) = 0, X_i(z_i,t) = 1\}, \qquad (5)$$ 

where $\Delta N_f(t,Z) = N_f(t+\tau,Z) - N_f(t,Z)$. Similarly, for a sufficiently small $\tau > 0$, it can be assumed that at most one recovery occurs during $(t, t+\tau)$. Hence, Equation (2) is simplified as, 

$$E\{\Delta N(t,Z)\} = E\{\Delta N_f(t,Z)\} - E\{\Delta N_r(t,Z)\}. \qquad (6)$$ 

Furthermore, we assume that at time $t_0 = 0$, $N(t_0,Z) = 0$, $N_f(t_0,Z) = 0$, and $N_r(t_0,Z) = 0$. Aggregating increments in Equation (6) from 0 to t, we have, 

$$E\{N(t,Z)\} = E\{N_f(t,Z)\} - E\{N_r(t,Z)\}. \qquad (7)$$ 



Hence, the expected number of nodes in the failure state equals the difference between the expected failures and the expected recoveries. We now group a distribution network of n nodes into m geographical regions $Z_j$, $1 \le j \le m$, based on their geo-locations. A city, e.g., a subdivision, is an example of a geographical region widely used by utilities. Then the failure-recovery process for the entire distribution network $N(t)$ is defined as, 

$$N(t) = [N(t,Z_1), N(t,Z_2), \ldots, N(t,Z_m)]^T, \qquad (8)$$ 

where $N(t,Z_j)$ characterizes how local power distribution in region $Z_j$ responds to an external disturbance. 
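
To make the counting processes concrete, the following minimal sketch builds the empirical failure, recovery, and joint processes for one region from failure occurrence times and durations, using the relation that nodes in failure equal failures minus recoveries. The function name `counting_processes` and the toy data are hypothetical and are not taken from the hurricane data sets.

```python
import numpy as np


def counting_processes(fail_times, durations, grid):
    """Empirical N_f(t), N_r(t) and N(t) = N_f(t) - N_r(t) on a time grid."""
    fail_times = np.asarray(fail_times, dtype=float)
    recov_times = fail_times + np.asarray(durations, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # On sorted event times, searchsorted(side="right") counts events <= t.
    n_f = np.searchsorted(np.sort(fail_times), grid, side="right")
    n_r = np.searchsorted(np.sort(recov_times), grid, side="right")
    return n_f, n_r, n_f - n_r


# Hypothetical toy data in minutes for a single region.
t_fail = [0.0, 5.0, 5.0, 20.0]
dur = [10.0, 2.0, 30.0, 1.0]
grid = [0.0, 6.0, 10.0, 21.0, 40.0]
n_f, n_r, n = counting_processes(t_fail, dur, grid)
print(n_f, n_r, n)
```

Applying the same function to the samples of each region $Z_j$ separately would give the components of the vector in Equation (8).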

IV. Non-Stationary Failure and Recovery 

We now derive non-stationary characteristics of failure and recovery. Our derivation reveals pertinent quantities that completely model the behaviors of large-scale power failures and recoveries in expected values. This is pertinent to learning a small number of parameters in Section V. 

A. Failure Process 

A failure process can be characterized to the first moment by failure rate functions. Let $\lambda_f(t) = [\lambda_f(t,Z_1), \lambda_f(t,Z_2), \ldots, \lambda_f(t,Z_m)]^T$ be a vector that consists of the rate functions of a failure process, where $\lambda_f(t,Z_j)$ is the expected number of new failures per unit time at epoch t and region $Z_j$, $j = 1, 2, \ldots, m$, 

$$\lambda_f(t,Z_j) = \lim_{\tau \to 0} \frac{1}{\tau} E\{N_f(t+\tau,Z_j) - N_f(t,Z_j)\}. \qquad (9)$$ 

The larger $\lambda_f(t,Z_j)$ is, the faster failures occur in $Z_j$ at time t. $\lambda_f(t,Z_j)$ is referred to as the rate function of the failure process $N_f(t,Z_j)$. Hence, failure rate quantifies the intensity of failure occurrence. A non-stationary failure process has a time-varying intensity function $\lambda_f(t,Z_j)$ across geo-locations. Assuming a failure process begins at $t = 0$, we have $E\{N_f(t)\} = [E\{N_f(t,Z_1)\}, \ldots, E\{N_f(t,Z_m)\}]^T$, where 

$$E\{N_f(t,Z_j)\} = \int_0^t \lambda_f(v,Z_j)\,dv \qquad (10)$$ 

for $1 \le j \le m$. 
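
The integral in Equation (10) can be approximated numerically once a rate function has been sampled on a time grid. The sketch below uses a cumulative trapezoidal rule; the function name `expected_failures` and the piecewise-linear rate are hypothetical illustrations, not values learned from data.

```python
import numpy as np


def expected_failures(times, rate):
    """Cumulative trapezoidal integral of a failure rate sampled at `times`."""
    times = np.asarray(times, dtype=float)
    rate = np.asarray(rate, dtype=float)
    # Area of each trapezoid between consecutive grid points.
    increments = 0.5 * (rate[1:] + rate[:-1]) * np.diff(times)
    return np.concatenate(([0.0], np.cumsum(increments)))


# Hypothetical rate (failures per hour) that rises and then decays.
hours = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
lam_f = np.array([0.0, 10.0, 20.0, 10.0, 0.0])
cumulative = expected_failures(hours, lam_f)
print(cumulative)
```

The last entry is the expected total number of failures over the whole interval under the assumed rate.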



B. Recovery Process 

A recovery process can be characterized by recovery rate function $\lambda_r(t)$, where $\lambda_r(t) = [\lambda_r(t,Z_1), \lambda_r(t,Z_2), \ldots, \lambda_r(t,Z_m)]^T$. $\lambda_r(t,Z_j)$ is the expected number of new recoveries per unit time at epoch t and region $Z_j$, 

$$\lambda_r(t,Z_j) = \lim_{\tau \to 0} \frac{1}{\tau} E\{N_r(t+\tau,Z_j) - N_r(t,Z_j)\}. \qquad (11)$$ 

A non-stationary recovery process $N_r(t,Z_j)$ has a time-varying rate function. Assuming the temporal failure process begins at $t = 0$, we have for $1 \le j \le m$, 

$$E\{N_r(t,Z_j)\} = \int_0^t \lambda_r(v,Z_j)\,dv. \qquad (12)$$ 



The recovery rate characterizes how rapidly recovery occurs, which is measured by failure duration D. For a non-stationary recovery process, a failure duration depends on when and where a failure occurs, as illustrated in Figure 2. Such non-stationarity of recovery is characterized by $g(d \mid t,Z_j)$, which is a conditional probability density function of failure duration $D = d$ given failure time $T = t$ at region $Z_j$. For a given threshold $d_0 > 0$, the conditional probability that a duration is bounded by $d_0$ for failures that occurred at time t is 

$$P\{D < d_0 \mid t, Z_j\} = \int_0^{d_0} g(v \mid t, Z_j)\,dv. \qquad (13)$$ 

When $d_0$ is sufficiently small, this probability characterizes rapid recovery that occurs shortly after failures. For a given $d_0$, the larger $P\{D < d_0 \mid t, Z_j\}$ is, the more rapid recovery dominates a recovery process. Given a desired value of the probability $P\{D < d_0 \mid t, Z_j\}$, the smaller $d_0$ is, the more dominating the rapid recovery is. 

Rapid recovery is referred to as infant recovery. This terminology is borrowed from infant mortality in survivability analysis [25]. Infant recovery is a desirable characteristic of the smart grid. In contrast, slow recovery is referred to as aging recovery, by analogy to aging mortality [26]. Infant and aging recovery can be formally defined as follows. 

Definition: Let $d_0 > 0$ be a threshold value. If a node remains in failure for a duration less than $d_0$, a recovery is an infant recovery. Otherwise, the recovery is an aging recovery. Infant recovery is characterized by $P\{D < d_0 \mid t, Z_j\}$. Aging recovery is characterized by $P\{D > d_0 \mid t, Z_j\}$. 
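
An empirical counterpart of the infant-recovery probability can be sketched as the fraction of failures, among those that start in a chosen time window, whose durations fall below the threshold. The window-based conditioning, the function name `infant_recovery_fraction`, and the toy values are assumptions made for illustration only.

```python
import numpy as np


def infant_recovery_fraction(fail_times, durations, d0, window):
    """Fraction of failures starting in window = (start, end) with duration < d0."""
    fail_times = np.asarray(fail_times, dtype=float)
    durations = np.asarray(durations, dtype=float)
    start, end = window
    in_window = (fail_times >= start) & (fail_times < end)
    if not in_window.any():
        return float("nan")  # no failures to condition on
    return float(np.mean(durations[in_window] < d0))


# Hypothetical toy data in hours: three failures start inside the window.
t_fail = [1.0, 2.0, 3.0, 10.0]
dur = [0.5, 5.0, 0.2, 0.1]
frac = infant_recovery_fraction(t_fail, dur, d0=1.0, window=(0.0, 5.0))
print(frac)
```

One minus this fraction gives the corresponding share of aging recoveries for the same window, up to durations exactly equal to the threshold.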

C. Joint Failure-Recovery Process 

A joint failure-recovery process characterizes an entire life cycle of a failure-recovery process (FRP), and represents the total number of nodes $N(t,Z)$ in failure state at time t in region Z (Equation (3)). The expected number of nodes in failure can be expressed in rate functions, 

$$E\{N(t,Z_j)\} = \int_0^t [\lambda_f(v,Z_j) - \lambda_r(v,Z_j)]\,dv. \qquad (14)$$ 

A failure-and-recovery process can be viewed as a birth-death process. However, commonly used birth-death processes have a stationary distribution of failure duration and assume independence between failure occurrence t and failure duration d [27]. Here, these two assumptions do not hold. This implies that failures that occur at different times can last for different durations. For example, under strong and sustained hurricane wind, failures that do not happen in day-to-day operation can occur due to falling debris and power lines. We shall further elaborate on this through the real-life examples in Sections V and VI. 

A recovery process is related to a failure process through a probability density function of failure durations. 

Theorem: Let $\{N_f(t,Z_j)\}$ be an independent increment (failure) process with a rate function $\lambda_f(t,Z_j)$, $1 \le j \le m$. Let $D(t)$ be the duration of a failure that occurred at time t and region $Z_j$. $D(t)$ has a conditional probability density function $g(d \mid t,Z_j)$, where $d > 0$, $t > 0$. Then the recovery rate $\lambda_r(t,Z_j)$ satisfies 

$$\lambda_r(t,Z_j) = \int_0^t g(t-s \mid s, Z_j)\,\lambda_f(s,Z_j)\,ds, \qquad (15)$$ 

where $1 \le j \le m$, and $d = t - s$ with s and t being the failure time and recovery time respectively. 

The theorem is a corollary of the Transient Little's Theorem [18]. Intuitively, $g(t-s \mid s,Z_j)\,ds$ can be viewed as the probability that a failure that occurred at time s and region $Z_j$ lasts $t - s$ duration. $g(t-s \mid s,Z_j)\,ds\,\lambda_f(s,Z_j)$ is the average number of failures per unit time that recover after $t - s$ duration, i.e., the recovery rate by definition. Aggregating over all failures that occurred prior to time t results in Equation (15). The detailed proof is given in [28]. 
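
The relation in Equation (15) can be evaluated numerically for a single region by replacing the integral with a Riemann sum on a uniform grid. The sketch below is an illustration under stated assumptions: the function name `recovery_rate`, the constant failure rate, and the exponential duration density are all hypothetical and are not the learned quantities of this paper.

```python
import numpy as np


def recovery_rate(times, lam_f, g):
    """Approximate lambda_r(t) = int_0^t g(t - s | s) lambda_f(s) ds.

    times: uniform grid starting at 0.
    lam_f: failure rate sampled on that grid.
    g(d, s): conditional density of duration d for a failure at time s.
    """
    times = np.asarray(times, dtype=float)
    lam_f = np.asarray(lam_f, dtype=float)
    dt = times[1] - times[0]
    lam_r = np.zeros_like(times)
    for k, t in enumerate(times):
        s = times[: k + 1]  # all failure times up to and including t
        lam_r[k] = np.sum(g(t - s, s) * lam_f[: k + 1]) * dt
    return lam_r


# Hypothetical inputs: 10 failures per hour, exponential durations with mean 1 hour.
times = np.arange(0.0, 20.0, 0.01)
lam_f = np.full_like(times, 10.0)
lam_r = recovery_rate(times, lam_f, lambda d, s: np.exp(-d))
print(lam_r[-1])
```

With a constant failure rate and a duration density that does not depend on s, the recovery rate rises from near zero and approaches the failure rate, as expected for a queue with unlimited servers.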

D. What to Learn 

What to learn now becomes apparent. Failure rate functions 
and probability density functions of recovery time completely 
specify our model to the first moment, i.e., 

• $\lambda_f(t,Z_j)$, for $1 \le j \le m$, 

• $g(t-s \mid s,Z_j)$, for $1 \le j \le m$. 

In general, the forms and the parameters of these two functions are unknown, and need to be learned from real data. The learned functions and the parameters can then be used to estimate the empirical processes. The empirical processes are the sample means $N(t,Z_j)$, $N_f(t,Z_j)$, and $N_r(t,Z_j)$ that estimate the true expectations $E\{N(t,Z_j)\}$, $E\{N_f(t,Z_j)\}$, and $E\{N_r(t,Z_j)\}$, respectively. 

V. Hurricane Ike 

We first apply learning to a real-life example of large-scale 
utility-service disruptions caused by a hurricane. 

A. Data From Hurricane Ike 

Hurricane Ike was one of the strongest hurricanes that occurred in 2008. Ike caused large-scale power failures, leaving more than 2 million customers without electricity, and was marked as the second costliest Atlantic hurricane of all time [29], [30]. 

As reported by the National Hurricane Center [31], the storm started to cause power failures across the onshore areas in Louisiana and Texas on September 12, 2008, prior to the landfall. Ike then made landfall at Galveston, Texas at 2:10 a.m. (CDT), September 13, 2008, causing strong winds, flooding, and heavy rains across Texas. The hurricane weakened to a tropical storm at 1:00 p.m. September 13 and passed Texas by 2:00 a.m. September 14. 

A major utility provider collected data on power failures from more than ten cities. The failures include failed circuits, fallen poles and power lines, and non-operational substations. The raw data set has 5152 samples. Each sample consists of the failure occurrence time ($t_i$) and duration ($d_i$) of a component (i) in a distribution network from September 12 through 14, 2008. The accuracy for time t is a minute. 

B. Data Processing 

The data set contains bursts of failures that occurred within a minute. As a minute is the smallest time scale for each sample, the bursts are considered as dependent failures. Dependent failures are grouped as one failed entity (i), with a unique failure occurrence time $t_i$ and duration $d_i$. After such preprocessing, the resulting data set has 465 failed entities. Two outliers with negative failure duration are further removed. The remaining 463 failed entities from 7 a.m. September 12 to 4 a.m. September 14 are referred to as nodes. $D = \{t_i, d_i\}_{i=1}^{463}$ is the data set we use for learning. 
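
The preprocessing steps above can be sketched as follows. The text does not state how the duration of a grouped entity is chosen, so this sketch assumes, purely for illustration, that a group keeps the longest duration among its members; the function name `preprocess` and the toy samples are hypothetical.

```python
from collections import defaultdict


def preprocess(samples):
    """Group same-minute failures into one entity, then drop negative durations.

    samples: iterable of (t_minutes, duration_minutes) pairs.
    Assumption (not from the paper): a group's duration is the longest
    duration among the failures that share its minute.
    """
    groups = defaultdict(list)
    for t, d in samples:
        groups[int(t)].append(d)  # failures within the same minute form a burst
    entities = [(t, max(ds)) for t, ds in sorted(groups.items())]
    # Remove outliers with negative failure duration after grouping.
    return [(t, d) for t, d in entities if d >= 0]


# Hypothetical toy samples: a burst at minute 0 and one negative-duration outlier.
raw = [(0, 10), (0, 12), (0, -3), (5, -1), (7, 30)]
nodes = preprocess(raw)
print(nodes)
```

Other grouping rules, such as keeping the first sample of each burst, would fit the same structure by changing the single line that selects the group duration.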
Fig. 5: Empirical distribution of failure duration for failures occurred during the landfall. 

Fig. 6: Comparison between the joint failure-recovery process $\hat{N}(t)$ from the data set and the reconstructed process $N(t)$ using learned parameters. 



Spatial variables $\{Z_j\}$'s can be either chosen a priori or learned from data. In this work, we choose the $\{Z_j\}$'s to be small cities, which include a natural living environment of customers; this choice is widely used by utility providers. There are 13 cities in the data set, as illustrated in Figure 7. 



C. Temporal Failure Process 

We first study the temporal non-stationarity of the failure- 
and-recovery process. Spatial variables are aggregated across 
the entire network. This is equivalent to reducing multiple geo- 
graphical areas to one entire impact-region from the hurricane. 
Then the geo-temporal failure-recovery process reduces to a 
temporal process. For notational simplicity, spatial variables 
are omitted for temporal processes. 

The empirical rate function is estimated using a simple
algorithm based on moving average [32]:

$$\hat{\lambda}_f(t) = \frac{N_f(t+\tau) - N_f(t)}{\tau},$$

where $\tau$ is chosen to be 5 hours. The
resulting rate function is overlaid with the samples on the 
number of failures Nf (t) in Figure kl where each bin is of 
duration 1 hour. 
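A moving-average rate estimate of this kind can be sketched as below. The forward-window form, (N_f(t + window) - N_f(t)) / window, is assumed here for illustration, as are the function name and the toy timestamps; other window placements (for example a centered window) are equally possible.

```python
from bisect import bisect_right


def moving_average_rate(failure_times, t, window):
    """Estimate the failure rate at time t from sorted failure times (hours).

    Uses the forward window (t, t + window]: the number of failures that
    occurred in the window divided by the window length.
    """
    def count_until(s):
        # N_f(s): number of failures with occurrence time <= s.
        return bisect_right(failure_times, s)

    return (count_until(t + window) - count_until(t)) / window


# Hypothetical occurrence times in hours since the start of the record.
times = [0.5, 1.0, 1.5, 6.0, 6.2, 6.4, 6.6, 6.8, 7.0, 7.5, 8.0, 9.0, 10.5]
rates = [moving_average_rate(times, t, window=5.0) for t in range(0, 3)]
```

A longer window smooths the estimate at the cost of blurring sharp changes, which is the trade-off behind the 5-hour choice in the text.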

The learned failure rate function shows a time-varying rate
of new failure occurrence:

(a) Prior to 7 p.m. September 12, the rate was low, i.e.,
fewer than 5 new failures occurred per hour. Hence 5 per
hour is considered as the failure rate in day-to-day operation.

(b) At 7 p.m. September 12, the rate first increased sharply
to 25 new failures per hour. In the next 6 hours, the rate
reached the peak value of nearly 50 new occurrences per hour.
This is consistent with the weather report [31] that strong
winds of about 145 mph and flooding impacted the onshore areas
prior to the landfall. The time of the peak coincides with the
landfall at 2:10 a.m. CDT on September 13.

(c) After staying at the high level for about 12 hours (from 7
p.m. September 12 to 7 a.m. September 13), the rate decreased
rapidly back to a low level of fewer than 5 new failures per
hour.



D. Temporal Recovery Process 

We now learn the empirical recovery process characterized
by $g(d|t)$, the conditional probability density function of
failure duration given failure occurrence time t. As the spatial
aggregation removes the geo-location variables, $g(d|t)$ is the
conditional density function of failure duration of an entire
network.

We use the 463 samples on the failure durations and
occurrences in our data set. These samples result in a joint
empirical distribution $g(d, t)$ in Figure 2. The height of each
bin located at (t, d) represents the number of failures that
occur at time t and last for duration d. Figure 2 shows
non-stationarity of failure durations. For example, a large number
(217) of failures that occurred between 7 p.m. September 12 and 8
a.m. September 13 lasted for more than a day. This indicates
that many failures that occurred during the surge of the hurricane
were difficult to recover from. Hence, a non-stationary distribution
for $g(d|t)$ is an appropriate assumption.

Given failure occurrence time t, we observe that the distribution
of duration is a combination of two components: infant
and aging recoveries. We thus select a mixture model for the
probability density function $g(d|t)$, where $d > 0$,

$$g(d|t) = \sum_{j=1}^{l(t)} p_j(t)\, g_j(d|t), \qquad (16)$$

where $l(t)$ is the number of mixtures at time t, $p_j(t)$
($1 \le j \le l(t)$) is a weighting factor for the jth mixture function
$g_j(d|t)$, and $\sum_{j=1}^{l(t)} p_j(t) = 1$. Weighting factor $p_j(t)$ signifies the
importance of the jth component $g_j(d|t)$. For a non-stationary
recovery process, these parameters vary with failure time t.

A mixture model is chosen since its parameters exhibit
interpretable physical meaning [33] [34] [17]. A parametric family
of Weibull mixtures is particularly appealing as the parameters
correspond to infant and aging recovery directly. Weibull
distributions have been widely used in survival analysis [26]
[25] and reliability theory [27], but not in characterizing
recovery from large-scale external disturbances. Specifically,
a Weibull distribution is

$$w(d; \gamma(t), k(t)) = \frac{k(t)}{\gamma(t)} \left(\frac{d}{\gamma(t)}\right)^{k(t)-1} e^{-\left(d/\gamma(t)\right)^{k(t)}}, \qquad (17)$$

where $d > 0$, and $k(t)$ and $\gamma(t)$ are the shape and scale
parameters respectively. Hence, the jth component in Equation 16 is
$g_j(d|t) = w(d; \gamma_j(t), k_j(t))$.

Fig. 7: Geographical location of the 13 regions (cities).

cane Ike. Cities are sequenced with respect to the time when

Shape and scale parameters, $k(t)$ and $\gamma(t)$, are pertinent
for characterizing the type of recovery. The smaller $k(t)$ and
$\gamma(t)$ are, the faster the decay of $g(d|t)$, the shorter the failure
duration and thus the faster the recovery. Hence, $k(t) < 1$
and moderate $\gamma(t)$ (e.g., $\gamma(t) \approx 10$ h or smaller) correspond to
infant recovery, whereas $k(t) > 1$ and large $\gamma(t)$ (e.g., $\gamma(t) \approx 100$ h)
correspond to aging recovery.

For simplicity, we use a piecewise homogeneous function
to approximate $g(d|t)$. The failure time t is divided into 5
intervals shown in Figure 5. Within interval $\psi_i$ for $1 \le i \le 5$,
$g(d|t \in \psi_i) = g_i(d)$ is assumed to be stationary, i.e., it does not
vary with failure time t. For different intervals, the $g(d|t \in \psi_i)$'s
have different parameters, which captures the non-stationarity,

$$g(d|t \in \psi_i) = \sum_{j=1}^{l_i} p_{i,j}\, w(d; \gamma_{i,j}, k_{i,j}), \qquad (18)$$

where $l_i$ is the number of mixture components in interval $\psi_i$.



The parameters of the Weibull mixtures within each interval
are learned through maximum likelihood estimation [17]
from the data. Failure durations obey different distributions
for failures that occurred at different intervals, showing the
non-stationarity. For example, the first interval $\psi_1$ (7 a.m. September
12 to 7 p.m. September 12) is when the network was not
yet impacted widely by Hurricane Ike. Three Weibull mixture components
are learned from the data, with the shape, scale and
weighting parameters as (1, 0.71, 0.486), (10.5, 14.4, 0.257)
and (10.7, 211.8, 0.257). The first two components result in
dominating infant recovery, where 74.3% of failures recovered
within a day. In contrast, the third interval $\psi_3$ (3 a.m. September
13 to 3 p.m. September 13) is when the large-scale failures
continued to occur after the landfall. Two Weibull mixture components
are learned from the data. The shape, scale and weighting
parameters are (5.3, 11.0, 0.323) and (12.4, 112.2, 0.677),
showing dominating aging recovery. As a result, only 32.2%
of failures recovered within a day. The second interval $\psi_2$
(7 p.m. September 12 to 8 a.m. September 13) is around
the hurricane landfall, where about half of the failures
that occurred experienced infant recovery within a day (see Figure
5 for the three Weibull mixtures). Over the 5 intervals, the
probability of infant recovery within a day changes over time,
showing the non-stationarity of failure-recovery processes.
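As a building block for the maximum likelihood step, the sketch below fits a single Weibull component to a list of durations. It is not the mixture estimator used in the paper: it covers only one component, and it solves the standard profile-likelihood equation for the shape by bisection, a technique chosen here for simplicity. The function name and the synthetic sample are hypothetical.

```python
import math
import random


def fit_weibull_mle(durations, tol=1e-9):
    """Return (shape, scale) maximizing the single-Weibull likelihood."""
    if any(d <= 0 for d in durations):
        raise ValueError("durations must be positive")
    d_max = max(durations)
    x = [d / d_max for d in durations]  # rescale so that x**k never overflows
    mean_log = sum(math.log(v) for v in x) / len(x)

    def score(k):
        # Profile-likelihood equation for the shape; its root is the MLE.
        powers = [v ** k for v in x]
        weighted = sum(p * math.log(v) for p, v in zip(powers, x))
        return weighted / sum(powers) - 1.0 / k - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0:  # expand the bracket until the sign changes
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("no finite shape estimate (identical durations?)")
    while hi - lo > tol:  # bisection on the increasing score function
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = d_max * (sum(v ** shape for v in x) / len(x)) ** (1.0 / shape)
    return shape, scale


# Synthetic check: draw durations from a known aging-type Weibull and refit.
random.seed(0)
sample = [random.weibullvariate(100.0, 3.0) for _ in range(5000)]
shape_hat, scale_hat = fit_weibull_mle(sample)
```

A fitted shape below 1 with a moderate scale would indicate infant recovery in the sense used above, and a shape above 1 with a large scale would indicate aging recovery.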

We then reconstruct the empirical temporal failure-recovery
process $\hat{N}(t)$ with the learned $\lambda_f(t)$ and $\lambda_r(t)$ through the model equation.
Figure 6 shows the comparison between $\hat{N}(t)$ and $N(t)$,
the reconstructed and the actual sample paths of the
failure-recovery process respectively. The closeness between the two
sample paths shows that the piecewise stationary $g(d|t)$
approximates well the actual failure-and-recovery process.
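The two quantities being compared can be sketched as follows. The empirical path counts entities that have failed but not yet recovered. The reconstructed path uses the standard infinite-server relation, in which the expected number in failure at time t is the integral of the failure rate at s times the probability that a failure starting at s lasts longer than t - s; this relation is assumed here to play the role of the model equation referenced in the text. Function names and toy inputs are hypothetical.

```python
import math


def empirical_in_failure(entities, t):
    """Count entities that have failed by time t and are not yet recovered."""
    return sum(1 for start, duration in entities if start <= t < start + duration)


def expected_in_failure(rate, survival, t, step=0.01):
    """Midpoint-rule approximation of the integral over s in [0, t] of
    rate(s) * survival(t - s, s), where survival(d, s) = P(duration > d | start s).
    """
    n = round(t / step)
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * step
        total += rate(s) * survival(t - s, s) * step
    return total


# Toy entities given as (start_hour, duration_hours).
entities = [(0.0, 5.0), (1.0, 1.0), (3.0, 10.0)]
path = [empirical_in_failure(entities, t) for t in (0.5, 1.5, 2.5, 4.0, 6.0)]

# Sanity check with a constant rate of 2 failures per hour and exponential
# durations with mean 3 hours (a Weibull with shape 1 and scale 3).
steady = expected_in_failure(lambda s: 2.0, lambda d, s: math.exp(-d / 3.0), 6.0)
```

For the constant-rate example the integral has the closed form 2 * 3 * (1 - exp(-6 / 3)), which gives a quick check of the numerical routine.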

E. Geo-Temporal Failure Process 

We now incorporate geo-location variables to learn the
geo-temporal non-stationarity. Failure process $N_f(t)$ is a
geo-temporal process with multiple attributes $N_f(t, Z_j)$ from m
geographical regions, $1 \le j \le m$. The empirical failure rate
functions $\lambda_f(t, Z_j)$ for $1 \le j \le m$ are estimated using the
same moving-average algorithm. The resulting rate vector
$\lambda_f(t)$ is multivariate, consisting of m time-varying functions.
Due to the small sample size, only 6 of the 13 cities shown
in Figure 7 have sufficient samples, ranging from
27 to 101 per city. Figure 8 shows the failure rates of the 6 cities. The
multivariate failure rates exhibit the following characteristics:

(a) Temporal non-stationarity: At a given geographical region
$Z_j$, $\lambda_f(t, Z_j)$ is a time-varying function similar to the
bell-shaped curve obtained for the entire network. Consider $Z_5$
as an example. The failure rate was low (fewer than 5 failures per hour)
prior to 7 p.m. September 12. Then, the rate increased sharply
and reached the maximum value of 25 new failures per hour
at about 1 a.m. September 13. After that, the rate decreased
rapidly to fewer than 5 failures per hour.

(b) Spatial non-stationarity: At a given time t, $\lambda_f(t, Z_j)$ is
a spatially-varying function. The peak values of failure rates
vary from 1.5 to 27 per hour across the 9 cities. The time
when the rate reached the peak value varies from 8 p.m.
September 12 to 7 a.m. September 13, and is depicted as a
dashed line at the bottom in Figure 8.

(c) Spatial-temporal non-stationarity: The regions are
labeled in Figure 8 in the order in which their failure rates reached
the maximum value. For example, the failure
rate at City $Z_4$ reached the peak value first, followed by the
failure rates at City $Z_1$ through City $Z_9$. The figure shows
the geo-temporal characteristic that failure rates at different
cities reached their peak values approximately in order from the coast
to inland. This appears to be consistent with the movement of
the hurricane track (Figure 1).

F. Geo-Temporal Recovery Process 

To learn the geo-temporal non-stationary recovery, we extend
the mixture model (Equation 16) to a geo-temporal
bivariate mixture, where for $1 \le j \le m$,

$$g(d|t, Z_j) = \sum_{i=1}^{l(t, Z_j)} p_i(t, Z_j)\, g_i(d|t, Z_j). \qquad (19)$$

Again our learning focuses on the 6 cities with sufficient
samples. Dependencies of failure durations among cities are
not studied in this work because of the small sample size.

We apply the piecewise homogeneous distribution function
in Equation 18 to each region $Z_j$,

$$g(d|t \in \psi_i, z \in Z_j) = \sum_{\zeta=1}^{l_{i,j}} p_{\zeta,i,j}\, g_{\zeta,i,j}(d). \qquad (20)$$

Here, each component $g_{\zeta,i,j}(d)$ is a Weibull distribution
$w(d; \gamma_{\zeta,i,j}, k_{\zeta,i,j})$, and $l_{i,j}$ is the number of components.
The mixtures $g(d|t \in \psi_i, z \in Z_j)$ and their
coefficients vary with respect to not only failure occurrence
time $\psi_i$ (temporal non-stationarity) but also geo-locations $Z_j$
(spatial non-stationarity).

Applying the maximum likelihood estimation [17], we obtain
the estimated parameters of Weibull distributions in the
6 cities. Note that, due to the small sample size in some of
the regions, the parameters of the distributions of failure duration
are assumed in our implementation not to vary with
failure occurrence time within a region. The probability of
infant recoveries is also computed accordingly. Three cities
(1, 4, 6) show a similar percentage of infant recovery, from
66% to 68%, whereas the remaining cities (3, 5, 8) have
infant recovery from 40% to 45%. Table I shows the learned
model parameters for two example cities. Figure 9 shows the
geographical distribution of infant and aging recoveries for the
6 cities.

The probability of infant recovery as well as model parameters
vary across different geographical regions, showing
the spatial non-stationarity of the recovery process. In more
detail, adjacent cities (e.g., 1 and 3) that are close to
the coast can exhibit different percentages of infant recovery.
Faraway cities (e.g., city 8, which is far inland, and city 5, which
is close to the coast) can also exhibit a similar percentage of
infant recovery. Hence, recovery processes seem to be complex
and require further study.
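The probability of infant recovery within a day follows directly from the mixture parameters as the mixture's cumulative distribution evaluated at 24 hours. The sketch below evaluates it for the two example cities, reading each Table I column as a (weight, scale, shape) triple; the function names are hypothetical.

```python
import math


def weibull_cdf(d, scale, shape):
    """P(duration <= d) for one Weibull component."""
    if d <= 0:
        return 0.0
    return 1.0 - math.exp(-((d / scale) ** shape))


def mixture_cdf(d, components):
    """components: list of (weight, scale, shape) triples whose weights sum to 1."""
    return sum(w * weibull_cdf(d, scale, shape) for w, scale, shape in components)


# Parameters as reported in Table I (durations in hours).
city_1 = [(0.3478, 0.0045, 0.2490), (0.3188, 12.1893, 2.7891), (0.3333, 197.0316, 3.7629)]
city_3 = [(0.3000, 0.0650, 0.2897), (0.1500, 12.2138, 3.9992), (0.5500, 129.7408, 2.8037)]

infant_1 = mixture_cdf(24.0, city_1)
infant_3 = mixture_cdf(24.0, city_3)
```

The two values come out at roughly 0.666 and 0.454, matching the P{d < 24} column of Table I to within rounding of the reported parameters.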

VI. Hurricane Sandy 

We now learn from real data from another real-life example
of large-scale disruptions, caused by Hurricane Sandy.
This provides an understanding of how our model and learning
approach can be generalized to other hurricanes.

A. Data

Hurricane Sandy made landfall in the Northeastern United
States on October 28, 2012. Hurricane Sandy left more
than 6 million customers without electricity for days. The state
with the most customers without power was New Jersey, where
about 1.98 million customers lost power supplies [9].

A utility company reported the number of failures (outages)
in more than 10 counties in New Jersey from October 28, 2012
to November 22, 2012. The aggregated number of reported
outages is a sample in our data set. Each sample consists of a
given geo-location and time t at the scale of 15 minutes (the
reporting interval). The geo-location variable $Z_j$ corresponds
to a county in New Jersey for $1 \le j \le 14$. The data set
consists of 2275 such samples, i.e., $\{N(t, Z_j)\}_{j=1}^{14}$ for time t
from October 28 to November 22, 2012. Figure 11(a) plots the
data. Note that such aggregated data provides neither the accurate
occurrence time nor the duration of each power failure.

TABLE I: Estimated parameters of distributions of failure
durations in 2 cities.

g(d | z ∈ Z_1)    ζ = 1     ζ = 2      ζ = 3       P{d < 24}
p_{1,ζ}           0.3478    0.3188     0.3333      66.63%
γ_{1,ζ}           0.0045    12.1893    197.0316
k_{1,ζ}           0.2490    2.7891     3.7629

g(d | z ∈ Z_3)    ζ = 1     ζ = 2      ζ = 3       P{d < 24}
p_{3,ζ}           0.3000    0.1500     0.5500      45.37%
γ_{3,ζ}           0.0650    12.2138    129.7408
k_{3,ζ}           0.2897    3.9992     2.8037

B. Empirical Failure Process 

Learning now begins with the aggregated number of failures
$N(t, Z_j)$ for $1 \le j \le 14$, from which failure and recovery
rates are estimated accordingly. This is a reverse process to
learning from detailed failure data in Hurricane Ike.

To learn the failure rate, we recall from the model that
$\lambda_f(t) = \frac{d}{dt} E[N_f(t)]$ and $\lambda_f(t) - \lambda_r(t) = \frac{d}{dt} E[N(t)]$.
This suggests that a lower bound $\lambda_{fl}(t)$ on the
failure rate can be estimated from the aggregate number of
failures at time t as

$$\lambda_{fl}(t, Z_j) = \frac{d}{dt} N(t, Z_j), \quad \text{if } t = t^*, \qquad (21)$$

where $t^*$ is a time epoch when $N(t^*, Z_j)$ increases.

To determine how to obtain such an estimate, we examine
characteristics of raw (time series) data $N(t, Z_j)$ at the county
level. Figure 10 shows two examples of the number of aggregated
failures $N(t, Z_j)$ at two different counties in New Jersey.
$N(t, Z_j)$ shows sharp increases and sharp decreases. A sharp
increase occurs when the failure rate exceeds the recovery rate,
whereas a sharp decrease happens when the recovery rate exceeds
the failure rate. Hence, a change point in $N(t, Z_j)$ can be used
to identify a lower bound for either a failure rate or a recovery
rate. In addition, a sharp increase or decrease indicates a salient
rather than noisy change point, where a lower bound can be
obtained accurately.

We first obtain the positive increments from $N(t, Z_j)$ for
each region $Z_j$ using Equation 21. We then aggregate the
increments over the 14 regions to obtain a lower bound $\lambda_{fl}(t)$
for the failure rate of the utility network. $N_f(t)$, the estimated
lower bound on the number of failures up to time t, can then
be obtained by integrating $\lambda_{fl}(t)$, which is shown in Figure
11(b).

C. Empirical Recovery Process 

To learn the empirical recovery rate, we apply Equation 21
except that $t^*$ corresponds to the time epoch of a decrease in
the number of failures. Figure 11(c) shows an estimated lower
bound $\lambda_{rl}(t)$ for the recovery rate and the cumulative number of
recoveries $N_r(t)$.

2 EST is used for plots in regard to Hurricane Sandy.

Fig. 11: Failure process and recovery process from Hurricane
Sandy: (a) $N(t)$, (b) $\lambda_{fl}(t)$, (c) $\lambda_{rl}(t)$.
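The lower-bound estimates for the failure and recovery rates can be sketched together, since both come from the increments of the aggregated counts. The sketch assumes a hypothetical input layout (a dict mapping each region to its list of counts sampled every 15 minutes) and approximates the derivative in Equation 21 by the finite difference over one reporting interval.

```python
def lower_bound_rates(counts_by_region, step_hours=0.25):
    """Return (failure, recovery) lower-bound rates aggregated over regions.

    Positive increments of N(t, Z_j) contribute to the failure rate and
    negative increments contribute to the recovery rate, both per hour.
    """
    n = min(len(counts) for counts in counts_by_region.values())
    failure = [0.0] * (n - 1)
    recovery = [0.0] * (n - 1)
    for counts in counts_by_region.values():
        for i in range(n - 1):
            delta = counts[i + 1] - counts[i]
            if delta > 0:
                failure[i] += delta / step_hours
            elif delta < 0:
                recovery[i] += -delta / step_hours
    return failure, recovery


def cumulative(rate, step_hours=0.25):
    """Integrate a rate series into a cumulative count series."""
    total, out = 0.0, []
    for value in rate:
        total += value * step_hours
        out.append(total)
    return out


# Toy counts for two hypothetical counties.
counts = {"A": [0, 10, 10, 4], "B": [5, 5, 9, 9]}
failure, recovery = lower_bound_rates(counts)
cum_failures = cumulative(failure)
```

Because failures and recoveries inside the same interval cancel in the count, these values can only underestimate the true rates, which is why they are lower bounds.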

Since the aggregated data from Hurricane Sandy does not
contain detailed recovery time for each failure, it is impossible
to learn the time-varying distribution of failure duration $g(d|t)$.
Nevertheless, the aggregated data can be used to estimate
a stationary distribution of recovery time, i.e., $g(d)$. As the
detailed information on failure duration is not available from
the data, we consider a simple distribution with one Weibull
mixture $g(d; \gamma, k)$. Applying discrete samples to Theorem
IV-C, the reconstructed recovery rate $\hat{\lambda}_{rl}(t)$ can be related to
$g(d; \gamma, k)$ and $\lambda_{fl}(t)$ as

$$\hat{\lambda}_{rl}(i\delta) = \sum_{j=0}^{i} g(i\delta - j\delta)\, \lambda_{fl}(j\delta)\, \delta, \qquad (22)$$

where $\delta = 15$ minutes is the step size and $i\delta$ is the discrete
time. Weibull parameters $\gamma$ and $k$ are then estimated to
minimize the estimation error $\|\hat{\lambda}_{rl}(t) - \lambda_{rl}(t)\|^2$. Figure 12
shows the estimated Weibull distribution, where the shape
parameter k = 1.3094 and the scale parameter γ = 54.1684.
The resulting stationary distribution of failure durations is
then used to reconstruct a lower bound for the recovery rate.
The accompanying figure shows the estimated $\lambda_{rl}(t)$ from the data set and
the reconstructed $\hat{\lambda}_{rl}(t)$. The reconstructed $\hat{\lambda}_{rl}(t)$ thus provides a
profile of how the recovery varies with time.
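The discrete relation in Equation 22 and the least-squares fit can be sketched as below. The paper does not state which optimizer was used; a plain grid search over candidate scale and shape values is substituted here for illustration, and all names and toy inputs are hypothetical.

```python
import math


def weibull_pdf(d, scale, shape):
    """Weibull density; taken as 0 for d <= 0 so the sum stays finite."""
    if d <= 0:
        return 0.0
    z = d / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def reconstruct_recovery_rate(failure_rate, scale, shape, step):
    """Discrete convolution of the failure rate with the duration density."""
    out = []
    for i in range(len(failure_rate)):
        total = 0.0
        for j in range(i + 1):
            total += weibull_pdf((i - j) * step, scale, shape) * failure_rate[j] * step
        out.append(total)
    return out


def fit_by_grid(failure_rate, recovery_rate, scales, shapes, step):
    """Return (squared_error, scale, shape) with the smallest squared error."""
    best = None
    for scale in scales:
        for shape in shapes:
            rebuilt = reconstruct_recovery_rate(failure_rate, scale, shape, step)
            err = sum((a - b) ** 2 for a, b in zip(rebuilt, recovery_rate))
            if best is None or err < best[0]:
                best = (err, scale, shape)
    return best


# Synthetic check: build a recovery rate from known parameters, then recover them.
step = 0.25  # 15 minutes, in hours
failure_rate = [0.0, 0.0, 40.0, 80.0, 40.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
target = reconstruct_recovery_rate(failure_rate, 5.0, 1.5, step)
best = fit_by_grid(failure_rate, target, [2.0, 5.0, 8.0], [1.0, 1.5, 2.0], step)
```

With real data the grid would need to be finer, or be replaced by a numerical optimizer, and the fit quality depends on how well the lower-bound rates track the true ones.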

VII. Findings and Discussions 

A. Findings 

Learning from Hurricane Ike and Hurricane Sandy results 
in the following findings. 

1) Failure process: Failure rates are time-varying for both
Hurricane Ike and Hurricane Sandy. The corresponding failure
processes are non-stationary in time and geographical regions.
However, the failure rates exhibit different characteristics at
the county level for Hurricane Ike and Hurricane Sandy:
the failure rates for Hurricane Ike appear to vary gradually,
whereas the failure rates for Hurricane Sandy exhibit sharp
changes, showing that failures occurred in groups (see footnote 3). When
aggregated over geographical regions, failure rates for both
hurricanes exhibit similar characteristics, i.e., first rapidly
increasing and then decreasing. More quantitative study is
needed to further compare the failure processes for different
hurricanes at different spatial scales.

Fig. 12: Weibull distribution for failure duration g(d).

2) Recovery process: Learned recovery rates from Hurricane
Ike and Hurricane Sandy are both time-varying. For
Hurricane Ike, the learned probability distributions of failure
durations exhibit non-stationarity in time and geo-locations,
i.e., they depend on when and where failures occur. Such distributions
comprise both infant and aging recovery, as shown in Table I and
Figure 9. The degree of infant recovery, however, differs
across cities. Three out of the six chosen cities recovered
more rapidly than the rest. Failures with infant and aging
recoveries are also interleaved in geo-locations.

The recovery for the provider network from Hurricane
Sandy shows a nearly steady rate of 7000 recoveries per hour.
In addition, the estimated Weibull distribution of the failure
duration exhibits stronger aging recovery than infant recovery.
A lack of infant recovery for this utility provider may indicate
that power distribution networks suffered severe disruptions
during Hurricane Sandy, which made recovery difficult.
Yet, detailed rather than aggregated failure data is needed for
accurately estimating distributions of failure durations.

Note that failures and recoveries can occur simultaneously
within a 15-minute interval. That is why the amount of
increase in $N(t, Z_j)$ is a lower bound of the actual failure
rate $\lambda_f(t, Z_j)$. When the number of failures increased rapidly,
e.g., from October 28 to October 31, recovery appeared to be
minor. When the hurricane passed the area after October 31,
recovery dominated. This is shown by the lower bounds of the
failure and recovery rates in Figures 11 and 12.

B. Discussions 

The type of available data is important for learning
non-stationary behaviors of power distribution in response to
external disruptions. The accurate failure data from Hurricane
Ike characterizes an entire life cycle of failure and recovery
processes. Data from Hurricane Sandy is aggregated and thus
lacks exact information on individual failure occurrence and
duration. Hence, learning is to infer failure and recovery
processes, which is a reverse process to that for Hurricane
Ike. The 15-minute sampling interval seems to be sufficient
for estimating the lower bounds of failure and recovery rates
from Hurricane Sandy. The aggregated data is insufficient for
characterizing a non-stationary distribution of failure duration
but can be used to learn a stationary distribution as an
approximation.

3 The cause shall be sought when more detailed data becomes available.

To deal with the small sample size, a rule of thumb is
used where training samples should be several times more
than parameters [17]. For Hurricane Ike, 20 or more samples
seem to be sufficient for estimating temporal characteristics
of failure and recovery rates but insufficient when the spatial
non-stationarity is studied. This suggests that the algorithm
needs to be enhanced, e.g., to identify spatial scales appropriate
for aggregation.

Our model assumes an underlying radial topology, where
failures can be considered as independent increments at large
temporal and spatial scales (minutes, cities). Detailed network
configuration is yet to be included in our model. For example,
topology and power flows [35] [36] are two possible
characteristics to be included for failures and recoveries.
Failure and recovery processes at a small time scale of
sub-seconds then need to be considered accordingly. The challenges
are the much increased complexity and the need for in-network
measurements at these temporal and spatial scales.

VIII. Conclusion 

This work shows that non-stationary geo-temporal random
processes naturally model large-scale failure and recovery of
power distribution induced by hurricanes. In particular,
multivariate geo-location based GI(t)/G(t)/∞ queues provide
such non-stationary failure and recovery processes. The
non-stationary failure and recovery can be completely characterized,
up to their expected values, by the time-varying failure rate and
the probability distribution of recovery time across geographical
regions.

Real data from two hurricanes have been used to learn
failure and recovery processes. Learning from detailed failure data
from Hurricane Ike reveals that the failure process across
different geographical regions follows a similar trend to that of
the hurricane. However, the failure and recovery processes
exhibit different infant and aging recovery across geographical
regions. Learning from aggregated data from an area impacted by
Hurricane Sandy shows that our model can infer failure and
recovery rates using aggregated data. The failure rates have
more significant discrete components for Hurricane Sandy
than for Hurricane Ike at geographical regions. The recovery
process is dominated by aging recovery for one utility network
from Hurricane Sandy but consists of a significant component
of infant recovery for another utility from Hurricane Ike. This
shows that the GI(t)/G(t)/∞ model is indeed needed for general
failure and recovery processes in dynamic queues. Note that
these findings are for power distribution through open rather
than underground networks.

These findings call for subsequent research on how distributed
power distribution networks are impacted by external disturbances.
For example, power failures and recoveries are yet to
be studied at all impact areas for Hurricane Sandy. Spatial-temporal
dependencies among power distribution networks at
different geographical regions need to be studied explicitly.
This requires combining detailed configurations of power
distribution with the dynamic model. These studies shall provide
further understanding of how to enhance the distributed power
infrastructure.
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